Scatter Chart Statistics Report
With the Scatter Chart Statistics Report you can analyze the statistical information of the data in your scatter charts. To access the statistics report, right-click a scatter chart and select Statistics Report... from the context menu. The form that opens shows the statistics for the scatter chart you right-clicked. Use the 'Chart' drop-down list to switch to any other scatter chart in your solution.
In the 'Data Statistics' table, statistical information is provided for all data series in the scatter chart, with a separate column for each X-axis and Y-axis. When a filter is applied to a scatter chart, all the statistics are computed from the filtered data. The information in the report is automatically updated when you make changes to the source scatter chart.
Weighting
Select Unweighted to view the data statistics without weighting applied, or Weighted to apply weighting to the data statistics. This setting is independent of the weighting used in the regression model 'Autofit to data' calibration. Weights are only applied to data from a 3D grid, where the cell volumes are used as weights, and to well-log data, where the sample spacings are used as weights. All other source types are unweighted regardless of the setting.
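As a rough illustration of what the Weighted setting changes, the sketch below compares an unweighted and a volume-weighted mean for a few hypothetical 3D grid cells. The values, the function names, and the weight-normalized estimators are assumptions made for illustration only; the exact formulas used by the report are not stated here.

```python
# Hypothetical property values for four 3D grid cells and their cell volumes.
values = [0.10, 0.20, 0.30, 0.40]
volumes = [1000.0, 1000.0, 1000.0, 7000.0]  # assumed to act as weights


def weighted_mean(xs, ws):
    """Mean of xs where each value counts in proportion to its weight."""
    return sum(w * x for x, w in zip(xs, ws)) / sum(ws)


def weighted_variance(xs, ws):
    """Weight-normalized variance about the weighted mean (an assumed estimator)."""
    m = weighted_mean(xs, ws)
    return sum(w * (x - m) ** 2 for x, w in zip(xs, ws)) / sum(ws)


# Equal weights reproduce the ordinary (unweighted) statistics.
unweighted = weighted_mean(values, [1.0] * len(values))
weighted = weighted_mean(values, volumes)

print(f"unweighted mean: {unweighted:.3f}")
print(f"weighted mean:   {weighted:.3f}")
```

In this example the large cell pulls the weighted mean toward its own value, which is why weighted and unweighted statistics can differ noticeably on grids with uneven cell volumes.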
Exporting a statistics report
You can export the displayed statistical information as a .csv file. To do so:
- Click the export button to open the file browser.
- The 'File name' will be populated with the 'Chart' name selected in the drop-down list. The name is editable.
- Browse to the location where you want to store the exported file and click Save.
Statistics definitions
- Source type The selected source type of the data: 2D Grid, 3D Grid, 3D Mesh, Marker set, Tri-mesh, Polyline set, Point set, or Wellbore.
- Source The name of the data source object.
- Wellbores Included only if the source type is 'Wellbore'. The selected wellbores are listed.
- Property The name of the property.
- Realization The realization number or time step.
- Weighting This row indicates whether the statistics for the series are weighted. A chart may contain multiple series from different source types, not all of which can be weighted.
- Filter The name of the applied property-based filter.
- Regression model The name of the regression model.
- Regression type The regression model type: Exponential, Linear, Logarithmic, Polynomial, or Power-Law.
- Count The number of observations in the series.
- Effective sample size The Kish effective sample size for weighted data: the square of the sum of the weights, divided by the sum of the squared weights, $n_{\mathrm{eff}} = \left(\sum_i w_i\right)^2 / \sum_i w_i^2$.
For unweighted or equally weighted data the effective sample size is the same as the count.
- Minimum The smallest series value.
- Maximum The largest series value.
- P10 The 10th percentile of the data or model.
- P50 (Median) The 50th percentile of the data or model. Also known as the median.
- P90 The 90th percentile of the data or model.
- Mean The average of the data.
- Standard deviation The standard deviation of the data, a measure of how widely the data are spread around the mean.
- Variance The square of the standard deviation.
- Skewness The skewness of the data, a measure of distribution asymmetry. Symmetric distributions have a skewness of zero. Distributions with asymmetric tails to the right are positively skewed. Distributions with asymmetric tails to the left are negatively skewed.
- Kurtosis The kurtosis of the data, a measure of the heaviness of the distribution tails. An untruncated Gaussian distribution has a kurtosis of 3. Distributions with kurtosis less than 3 have lighter tails than a Gaussian distribution. Distributions with kurtosis greater than 3 have heavier tails. Note: The kurtosis should not be confused with the “excess kurtosis,” another common measure of tail heaviness. The excess kurtosis is equal to the kurtosis minus 3.
- Covariance A measure of the strength of linear correlation between two variables. A covariance of zero indicates no linear correlation. A negative covariance indicates anti-correlation, that is, one variable tends to decrease when the other increases. Larger-magnitude covariances indicate stronger correlations. The magnitude of the covariance cannot be larger than the product of the two variables' standard deviations. The covariance of a variable with itself is the same as its variance.
- Correlation coefficient A normalized measure of the strength of linear correlation between two variables. The correlation coefficient is equal to the covariance divided by the product of the standard deviations of the two variables and is always between -1 and 1.
- Coefficient of determination, R² A measure of the goodness of fit of a model to its data. Larger values indicate better fits. The coefficient of determination of a regression model is computed between the Y-axis data of the series and the Y-axis predictions of the model. The largest possible value for a perfect fit is 1. A model that is just a constant equal to the average of the data, sometimes known as a “baseline model,” has a coefficient of determination of zero. A negative coefficient of determination indicates that the model is worse than a baseline model. Note: The coefficient of determination, sometimes known as R squared, is often confused with the square of the correlation coefficient, but they are not the same thing except for certain special cases. One such special case is a linear model that has been calibrated to data with least squares regression.
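Several of the definitions above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the report's implementation: the function names and sample data are hypothetical, and population-style moments (dividing by the count) are assumed because the definitions do not say which estimator is used.

```python
import math


def kish_effective_sample_size(weights):
    """(sum of weights)^2 divided by the sum of squared weights."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


def mean(xs):
    return sum(xs) / len(xs)


def central_moment(xs, k):
    """k-th central moment, dividing by the count (assumed convention)."""
    m = mean(xs)
    return sum((x - m) ** k for x in xs) / len(xs)


def skewness(xs):
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5


def kurtosis(xs):
    """Kurtosis (not excess kurtosis): a Gaussian distribution gives 3."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2


def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


def r_squared(ys, predictions):
    """1 - (residual sum of squares) / (total sum of squares about the mean)."""
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

# Equal weights: the effective sample size equals the count.
print(kish_effective_sample_size([1.0] * 5))
# Unequal weights: the effective sample size drops below the count of 3.
print(kish_effective_sample_size([1.0, 1.0, 2.0]))

# Least squares line through (x, y): here R² matches the squared correlation.
slope = covariance(x, y) / covariance(x, x)
intercept = mean(y) - slope * mean(x)
ls_fit = [intercept + slope * v for v in x]
print(r_squared(y, ls_fit), correlation(x, y) ** 2)

# Baseline model (constant equal to the mean of y): R² is zero.
baseline = [mean(y)] * len(y)
print(r_squared(y, baseline))

# A model not fitted by least squares (y = x): R² is negative, yet the
# squared correlation between data and predictions is unchanged.
print(r_squared(y, x), correlation(y, x) ** 2)
```

The last two lines show why R² and the squared correlation coefficient should not be used interchangeably: they agree for the least squares line but diverge for the uncalibrated model.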
Column header options
You can right-click the column header area and select from the context menu: